ABSTRACT

Block diagonalization is a linear precoding technique for the multiple
antenna broadcast (downlink) channel that involves transmission of
multiple data streams to each receiver such that no multi-user
interference is experienced at any of the receivers. This low-complexity
scheme operates only a few dB away from capacity but requires very
accurate channel knowledge at the transmitter, which can be very
difficult to obtain in fading scenarios. We consider a limited feedback
system where each receiver knows its channel perfectly, but the
transmitter is only provided with a finite number of channel feedback
bits from each receiver. Using a random vector quantization argument, we
quantify the throughput loss due to imperfect channel knowledge as a
function of the feedback level. The quality of channel knowledge must
improve in proportion to the SNR in order to prevent interference
limitations, and we show that scaling the number of feedback bits
linearly with the system SNR (in dB) is sufficient to maintain a bounded
rate loss. Finally, we investigate a simple scalar quantization scheme
that is seen to achieve the same scaling behavior as vector quantization.

Index Terms: MIMO systems, Broadcast channels, Quantization,
Finite Rate Feedback, Multiplexing Gain

1. INTRODUCTION 

In multiple antenna broadcast (downlink) channels, transmit antenna
arrays can be used to simultaneously transmit data streams to
receivers and thereby significantly increase throughput. Dirty paper
coding (DPC) is capacity achieving for the MIMO broadcast channel [1],
but this technique has a very high level of complexity. Zero
Forcing (ZF) and Block Diagonalization (BD) [2], [3] are alternative
low-complexity transmission techniques. Although not optimal,
these linear precoding techniques utilize all available spatial degrees
of freedom and perform measurably close to DPC in many scenarios [4].

If the transmitter is equipped with M antennas and there are 
at least M aggregate receive antennas, zero-forcing involves trans- 
mission of M spatial beams such that independent, de-coupled data 
channels are created from the transmit antenna array to M receive 
antennas distributed amongst a number of receivers. Block diago- 
nalization similarly involves transmission of M spatial beams, but 
the beams are selected such that the signals received at different re- 
ceivers, but not necessarily at the different antenna elements of a 
particular receiver, are de-coupled. For example, if there are M/2 
receivers with two antennas each, then two beams are aimed at each 
of the receivers. If ZF is used, an independent and de-coupled data
stream is received on each of the M antennas. If BD is used, the
streams for different receivers do not interfere, but the two streams 
intended for a single receiver are generally not aligned with its two 
antennas and thus post-multiplication by a rotation matrix (to align 
the streams) is generally required before decoding. 

In order to correctly aim the transmit beams, both schemes require
perfect Channel State Information at the Transmitter (CSIT).
Imperfect CSIT leads to incorrect beam selection and therefore
multiuser interference, which ultimately leads to a throughput loss.
Unlike point-to-point MIMO systems, where imperfect CSIT causes only
an SNR offset in the capacity vs. SNR curve, in broadcast MIMO systems
the level of CSIT affects the slope of the curve and hence the
multiplexing gain. We consider the case where the CSI is known
perfectly at the receiver and is communicated to the transmitter
through a finite rate feedback channel, and we quantify the maximum
rate loss due to finite rate feedback with BD. MISO systems and ZF
with finite rate feedback are analyzed in [5]. Similar to the results in
[5], we show that scaling the number of feedback bits approximately
linearly with the system SNR is sufficient to maintain the slope of
the capacity vs. SNR curve and hence a constant gap from the
capacity of BD with perfect CSIT. The scaling factor for BD offers an
advantage over ZF in terms of the number of bits required to achieve
the same sum capacity. Finally, we investigate a simple scalar
quantization scheme and see that this low complexity scheme requires the
same feedback scaling.

2. SYSTEM MODEL 

We consider a single transmitter and K user MIMO system where 
each user has N antennas and the transmitter has M antennas. The 
broadcast channel is described as: 

$$\mathbf{y}_i = \mathbf{H}_i^H \mathbf{x} + \mathbf{n}_i, \quad i = 1, \ldots, K \tag{1}$$

where $\mathbf{H}_i \in \mathbb{C}^{M \times N}$ is the channel matrix from the transmitter to
the $i$-th user ($1 \le i \le K$) and the vector $\mathbf{x} \in \mathbb{C}^{M \times 1}$ is the
transmitted signal, $\mathbf{n}_i \in \mathbb{C}^{N \times 1}$ are independent complex Gaussian noise
vectors of unit variance and $\mathbf{y}_i \in \mathbb{C}^{N \times 1}$ is the received signal
vector at the $i$-th user. We assume a transmit power constraint so that
$E[\|\mathbf{x}\|^2] \le P$ ($P > 0$). We also assume that $K > 1$ and $K = M/N$,
which implies that the aggregate number of receive antennas equals
the number of transmit antennas; as a result it is not necessary to
select a subset of receivers for transmission.

The entries of Hi are assumed to be i.i.d. unit variance complex 
Gaussian random variables, and the channel is assumed to be block 
fading with independent fading from block to block. Each of the
receivers is assumed to have perfect and instantaneous knowledge of
its own channel matrix. The channel matrix is quantized at each
receiver and fed back to the transmitter (which has no other knowl- 
edge of the instantaneous CSI) over a zero delay, error free, finite 
rate channel. In order to perform BD, it is only necessary to know 
the spatial direction of each receiver's channel, i.e., the subspace 
spanned by Hi, and thus the feedback only conveys this informa- 
tion. 



2.1. Finite Rate Feedback Model 

The quantization codebook used at each receiver is fixed beforehand
and is known to the transmitter and each receiver. A quantization
codebook $\mathcal{C}$ consists of $2^B$ matrices in $\mathbb{C}^{M \times N}$,
$(\mathbf{W}_1, \ldots, \mathbf{W}_{2^B})$, where $B$ is the number of feedback bits per user.
The quantization of a channel matrix $\mathbf{H}_i$, say $\hat{\mathbf{H}}_i$, is chosen
from the codebook $\mathcal{C}$ according to:

$$\hat{\mathbf{H}}_i = \arg\min_{\mathbf{W} \in \mathcal{C}} d(\mathbf{H}_i, \mathbf{W}) \tag{2}$$

where $d(\mathbf{H}_i, \mathbf{W})$ is the distance metric. Here, we consider the
chordal distance [6]:

$$d(\mathbf{H}_i, \mathbf{W}) = \sqrt{\sum_{j=1}^{N} \sin^2 \theta_j} \tag{3}$$

where the $\theta_j$'s are the principal angles between the two subspaces
spanned by the columns of the matrices. As the principal angles
depend only on the subspaces spanned by the columns of the matrices,
it can be assumed that the elements of $\mathcal{C}$ are unitary matrices. No
channel magnitude information is fed back to the transmitter.
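The quantization rule (2) with the chordal distance (3) can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function names, the example dimensions (M = 6, N = 2, B = 6) and the choice of drawing codewords as the Q factor of a complex Gaussian matrix are assumptions made for illustration. The principal angles are obtained from the singular values of the product of two orthonormal bases, which equal the cosines of those angles.

```python
import numpy as np

def random_subspace(M, N, rng):
    """Orthonormal basis (M x N) of the column space of an i.i.d. complex Gaussian matrix."""
    G = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
    Q, _ = np.linalg.qr(G)  # reduced QR: Q has N orthonormal columns
    return Q

def chordal_distance(A, B):
    """Chordal distance between the column spaces of A and B (both M x N, full column rank)."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^H Qb are the cosines of the principal angles.
    cos_theta = np.linalg.svd(Qa.conj().T @ Qb, compute_uv=False)
    cos_theta = np.clip(cos_theta, 0.0, 1.0)  # guard against rounding above 1
    return np.sqrt(np.sum(1.0 - cos_theta**2))

def quantize(H, codebook):
    """Index of the codeword closest to H in chordal distance, following (2)."""
    return int(np.argmin([chordal_distance(H, W) for W in codebook]))

# Hypothetical example: M = 6 transmit antennas, N = 2 receive antennas, B = 6 bits.
rng = np.random.default_rng(0)
M, N, B = 6, 2, 6
codebook = [random_subspace(M, N, rng) for _ in range(2**B)]
H = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
idx = quantize(H, codebook)  # only these B bits would be fed back
```

Because the distance depends only on the column spaces, multiplying H by any invertible N x N matrix leaves the selected index unchanged, which is why no magnitude information needs to be conveyed.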



2.2. Random Quantization Codebooks

Since the design of optimal quantization codebooks for the given
distance metric is a very difficult problem, we instead study
performance averaged over random quantization codebooks. The
Grassmannian manifold is the set of all $N$ dimensional subspaces in an
$M$ dimensional Euclidean space, and is denoted by $G_{M,N}(\mathbb{C})$. Each
of the $2^B$ matrices making up the random quantization codebook is
chosen independently and uniformly distributed over $G_{M,N}(\mathbb{C})$, and
each matrix can be assumed to be unitary (points in $G_{M,N}(\mathbb{C})$ are
equivalence classes of orthonormal matrices in $\mathbb{C}^{M \times N}$). We
analyze the performance averaged over all possible random codebooks.
The distortion or error associated with a given codebook $\mathcal{C}$ for the
quantization of $\mathbf{H} \in \mathbb{C}^{M \times N}$ is defined as:

$$D = E\left[d^2(\mathbf{H}, \hat{\mathbf{H}})\right] = E\left[\min_{\mathbf{W} \in \mathcal{C}} d^2(\mathbf{H}, \mathbf{W})\right] \tag{4}$$

where $\hat{\mathbf{H}}$ is the quantization of $\mathbf{H}$. It is shown in [7] that $D$ satisfies:

$$D \le \frac{\Gamma(1/T)}{T} (C_{MN})^{-1/T} 2^{-B/T} + N \exp\left[-(2^B C_{MN})^{1-a}\right] = \bar{D} \tag{5}$$

for a codebook of size $2^B$, where $T = N(M - N)$ and $a \in (0, 1)$ is
a real number between 0 and 1 chosen such that $(C_{MN} 2^B)^{-a/T} \le 1$.
$C_{MN}$ is given by $\frac{1}{T!} \prod_{i=1}^{N} \frac{(M-i)!}{(N-i)!}$. The second
(exponential) term in (5) for the expression of $\bar{D}$ can be neglected
for large $B$.
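To give a feel for the size of the bound in (5), the sketch below evaluates its leading (non-exponential) term. It assumes that the bound reads Gamma(1/T)/T * C_MN^(-1/T) * 2^(-B/T) with C_MN = (1/T!) * prod_{i=1}^{N} (M-i)!/(N-i)! and T = N(M-N); the function names are ours, and log-gamma is used only to avoid overflow in the factorials.

```python
import math

def log2_C(M, N):
    """log2 of C_MN, assuming C_MN = (1/T!) * prod_{i=1}^{N} (M-i)!/(N-i)!."""
    T = N * (M - N)
    ln_c = -math.lgamma(T + 1)  # ln(1/T!)
    for i in range(1, N + 1):
        ln_c += math.lgamma(M - i + 1) - math.lgamma(N - i + 1)
    return ln_c / math.log(2)

def distortion_bound(M, N, B):
    """Leading term of (5); the exponential term is neglected (large-B assumption)."""
    T = N * (M - N)
    log2_d = (math.lgamma(1.0 / T) - math.log(T)) / math.log(2)
    log2_d -= (log2_C(M, N) + B) / T
    return 2.0 ** log2_d

# Every additional T = N(M-N) bits halves the bound, e.g. for M = 8, N = 2 (T = 12):
ratio = distortion_bound(8, 2, 22) / distortion_bound(8, 2, 10)
```

The 2^(-B/T) dependence is what drives the feedback scaling discussed later: keeping P times the distortion constant requires roughly T additional bits for every doubling (3 dB) of the SNR.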



2.3. Block Diagonalization

The Block Diagonalization strategy when perfect CSI is available
at the transmitter involves precoding the signals to be transmitted
in order to suppress interference at each user due to all other users
(but not due to different antennas for the same user). If $\mathbf{s}_i \in \mathbb{C}^{N \times 1}$
contains the $N$ complex symbols intended for the $i$-th ($1 \le i \le K$)
user and $\mathbf{V}_i \in \mathbb{C}^{M \times N}$ is the precoding matrix, then the transmitted
vector is given by:

$$\mathbf{x} = \sqrt{\frac{P}{K}} \sum_{i=1}^{K} \mathbf{V}_i \mathbf{s}_i \tag{6}$$

and the received signal at the $i$-th user is given by:

$$\mathbf{y}_i = \sqrt{\frac{P}{K}} \mathbf{H}_i^H \mathbf{V}_i \mathbf{s}_i + \sqrt{\frac{P}{K}} \sum_{j \ne i} \mathbf{H}_i^H \mathbf{V}_j \mathbf{s}_j + \mathbf{n}_i \tag{7}$$

It is assumed that a uniform power allocation strategy among
users is employed (due to absence of channel magnitude information
at the transmitter). Furthermore, in order to maintain the power
constraint it is assumed that $\mathbf{V}_i^H \mathbf{V}_i = \mathbf{I}_N$ and $E[\|\mathbf{s}_i\|^2] \le 1$.

Following the BD strategy, each $\mathbf{V}_i$ is chosen such that $\mathbf{H}_j^H \mathbf{V}_i$
is $\mathbf{0}$, $\forall i \ne j$. This amounts to determining an orthonormal basis
for the null space of the matrix formed by stacking all $\mathbf{H}_j$, $j \ne i$,
matrices together. This reduces the interference terms in equation
(7) to zero at each user. This is different from Zero Forcing, where each
complex symbol to be transmitted to the $m$-th antenna (among the $N$
antennas) of the $i$-th user is precoded by a vector that is orthogonal
to all the columns of $\mathbf{H}_j$, $j \ne i$, as well as orthogonal to all but the
$m$-th column of $\mathbf{H}_i$.

However, perfect knowledge of the $\mathbf{H}_i$'s at the transmitter is
required for zero interference. When finite rate feedback is employed,
each $\mathbf{V}_i$ is chosen such that $\hat{\mathbf{H}}_j^H \mathbf{V}_i = \mathbf{0}$ $\forall i \ne j$, which is $\ne \mathbf{H}_j^H \mathbf{V}_i$
in general, and leads to a loss in throughput.
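The null-space construction described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the paper's code: it assumes K > 1 users whose channel matrices are M x N with KN = M, and the function name `bd_precoders` is ours. The left singular vectors of the stacked interfering channels that lie beyond its rank span the required orthogonal complement.

```python
import numpy as np

def bd_precoders(H_list):
    """BD precoders for channels H_i (each M x N, with K*N = M and K > 1).

    Returns matrices V_i (M x N) with orthonormal columns such that
    H_j^H V_i = 0 for every j != i.
    """
    K = len(H_list)
    N = H_list[0].shape[1]
    precoders = []
    for i in range(K):
        # Columns of `others` span the directions user i's beams must avoid.
        others = np.hstack([H_list[j] for j in range(K) if j != i])  # M x (K-1)N
        # Left singular vectors beyond rank (K-1)N are orthogonal to all those columns.
        U, _, _ = np.linalg.svd(others, full_matrices=True)
        precoders.append(U[:, (K - 1) * N:])
    return precoders

# Hypothetical example with M = 6, N = 2, K = 3 and i.i.d. complex Gaussian channels.
rng = np.random.default_rng(1)
M, N, K = 6, 2, 3
Hs = [rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)) for _ in range(K)]
Vs = bd_precoders(Hs)
leakage = max(np.linalg.norm(Hs[j].conj().T @ Vs[i])
              for i in range(K) for j in range(K) if j != i)
```

With perfect channel knowledge `leakage` is zero up to rounding. Passing quantized channels to `bd_precoders` instead, while measuring the leakage against the true channels, reproduces the residual interference that finite rate feedback causes.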



3. THROUGHPUT ANALYSIS 
3.1. Fixed Feedback Quality

In the case of perfect CSIT and BD, the transmitter has the ability to 
suppress all interference terms giving a per user ergodic capacity of: 



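$$R_{BD}(P) = E_{\mathbf{H}}\left[\log_2 \left| \mathbf{I}_N + \frac{P}{K} \mathbf{H}^H \mathbf{V}_{BD} \mathbf{V}_{BD}^H \mathbf{H} \right|\right] \tag{8}$$

where $\mathbf{V}_{BD}$ is the precoding matrix chosen by the BD procedure
given the channels of all the users. The expectation is carried out
over all channels $\mathbf{H}$.

For finite rate feedback of $B$ bits per user, multiuser interference
cannot be perfectly canceled and leads to additional noise power.
Taking this interference into account, the per user throughput is:

$$R_{FB}(P) = E_{\mathbf{H},\mathcal{C}}\left[\log_2 \left| \mathbf{I}_N + \frac{P}{K} \sum_{j=1}^{K} \mathbf{H}_i^H \mathbf{V}_j \mathbf{V}_j^H \mathbf{H}_i \right| - \log_2 \left| \mathbf{I}_N + \frac{P}{K} \sum_{j \ne i} \mathbf{H}_i^H \mathbf{V}_j \mathbf{V}_j^H \mathbf{H}_i \right|\right] \tag{9}$$

where the expectation is carried out over all channels as well as
random codebooks ($i$ is any user between 1 and $K$).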



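Theorem 3.1. The rate loss per user incurred due to finite rate
feedback with respect to perfect CSIT using Block Diagonalization can
be bounded from above by:

$$\Delta R(P) = [R_{BD}(P) - R_{FB}(P)] \le N \log_2(1 + P \cdot D)$$

Proof.

$$\begin{aligned}
\Delta R(P) &= [R_{BD}(P) - R_{FB}(P)] \\
&\overset{(a)}{\le} E_{\mathbf{H}}\left[\log_2 \left| \mathbf{I}_N + \frac{P}{K} \mathbf{H}^H \mathbf{V}_{BD} \mathbf{V}_{BD}^H \mathbf{H} \right|\right]
 - E_{\mathbf{H},\mathcal{C}}\left[\log_2 \left| \mathbf{I}_N + \frac{P}{K} \mathbf{H}_i^H \mathbf{V}_i \mathbf{V}_i^H \mathbf{H}_i \right|\right] \\
&\qquad + E_{\mathbf{H},\mathcal{C}}\left[\log_2 \left| \mathbf{I}_N + \frac{P}{K} \mathbf{H}_i^H \Big( \sum_{j \ne i} \mathbf{V}_j \mathbf{V}_j^H \Big) \mathbf{H}_i \right|\right] \\
&\overset{(b)}{=} E_{\mathbf{H},\mathcal{C}}\left[\log_2 \left| \mathbf{I}_N + \frac{P}{K} \mathbf{H}_i^H \Big( \sum_{j \ne i} \mathbf{V}_j \mathbf{V}_j^H \Big) \mathbf{H}_i \right|\right] \\
&\overset{(c)}{=} E_{\mathbf{H},\mathcal{C}}\left[\log_2 \left| \mathbf{I}_N + \frac{P}{K} \tilde{\mathbf{H}}_i^H \Big( \sum_{j \ne i} \mathbf{V}_j \mathbf{V}_j^H \Big) \tilde{\mathbf{H}}_i \mathbf{\Lambda}_i \right|\right] \\
&\overset{(d)}{\le} \log_2 \left| \mathbf{I}_N + \frac{P(K-1)}{K} E_{\mathbf{H},\mathcal{C}}\left[ \tilde{\mathbf{H}}_i^H \mathbf{V}_j \mathbf{V}_j^H \tilde{\mathbf{H}}_i \right] M \right|
\end{aligned}$$

Here, bound (a) follows by neglecting the positive semidefinite
interference terms. Both $\mathbf{V}_{BD}$ and $\mathbf{V}_i$ are uniformly distributed
and independent of $\mathbf{H}_i$, which results in (b). We write $\mathbf{H}_i \mathbf{H}_i^H =
\tilde{\mathbf{H}}_i \mathbf{\Lambda}_i \tilde{\mathbf{H}}_i^H$, where $\tilde{\mathbf{H}}_i$ forms an orthonormal basis for the subspace
spanned by the columns of $\mathbf{H}_i$ and $\mathbf{\Lambda}_i = \mathrm{diag}[\lambda_1, \ldots, \lambda_N]$ are the
$N$ non-zero unordered eigenvalues of $\mathbf{H}_i \mathbf{H}_i^H$ (assuming $\mathbf{H}_i$ is of
rank $N$ and diagonalizable), where $E[\mathbf{\Lambda}_i]$ is $M \mathbf{I}_N$, and (c) follows.
The bound (d) follows from Jensen's inequality due to the concavity
of $\log |\cdot|$. It can also be shown that $E_{\mathbf{H},\mathcal{C}}\left[\tilde{\mathbf{H}}_i^H (\mathbf{V}_j \mathbf{V}_j^H) \tilde{\mathbf{H}}_i\right] =
\frac{D}{M-N} \mathbf{I}_N$. Substituting this into (d) and using $K = M/N$ gives
$\frac{P(K-1)}{K} \cdot \frac{M D}{M-N} = P \cdot D$, which provides a bound on the rate loss
per user. $D$ can be upper bounded by $\bar{D}$ from (5) for large enough $B$. □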
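The behaviour of the bound in Theorem 3.1, read as N log2(1 + P D), can be seen numerically. The sketch below is only an illustration; the function name and the sample values of N, P and D are assumptions. It contrasts a fixed distortion, for which the bound keeps growing with SNR, with a distortion that shrinks as 1/P, for which the bound stays at N bps/Hz.

```python
import math

def rate_loss_bound(N, P_dB, D):
    """Upper bound N*log2(1 + P*D) on the per-user rate loss, with P given in dB."""
    P = 10.0 ** (P_dB / 10.0)
    return N * math.log2(1.0 + P * D)

N = 2  # hypothetical number of receive antennas per user
for P_dB in (10, 20, 30):
    P = 10.0 ** (P_dB / 10.0)
    fixed = rate_loss_bound(N, P_dB, D=1e-2)      # distortion held constant
    scaled = rate_loss_bound(N, P_dB, D=1.0 / P)  # distortion shrinking as 1/P
    print(f"{P_dB} dB: fixed-D bound {fixed:.2f}, scaled-D bound {scaled:.2f}")
```

This is the mechanism behind the feedback scaling studied next: the product of P and the distortion, not the distortion alone, has to stay bounded.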

3.2. Increasing Feedback Quality 

Theorem 3.2. In order to maintain a rate loss $\Delta R(P)$ of no larger
than $\log_2 b$ per user, it is sufficient for the number of feedback
bits per user to be scaled with SNR as:

$$B \approx \frac{N(M-N)}{3} P_{dB} - N(M-N) \log_2\left(b^{1/N} - 1\right) + N(M-N) \log_2 \frac{\Gamma\left(\frac{1}{N(M-N)}\right)}{N(M-N)} - \log_2(C_{MN}) \tag{10}$$

This expression can be found by equating the upper bound on
rate loss with $\log_2 b$ and solving for $B$ as a function of $P$. Solving
this numerically will yield the number of bits strictly sufficient for a
maximum rate loss of $\log_2 b$. We assume that $B$ is large enough to
neglect the exponential term in the expression for $\bar{D}$ from (5), which
yields the above approximation. The total contribution of the term
containing the logarithm of the gamma function is less than a bit and
it can usually be neglected. To maintain a system throughput loss of
$M$ bps/Hz (a per-user loss of $N$ bps/Hz, i.e., $b = 2^N$), which
corresponds to an SNR gap of no more than 3 dB with respect to BD
with perfect CSIT, it is sufficient to scale the bits as:

$$B = \frac{N(M-N)}{3} P_{dB} - \log_2(C_{MN}) \tag{11}$$

The factor of $N(M - N)$ suggests that the number of feedback
bits per antenna decreases with increasing $N$. The number of bits
can grow very large for MIMO broadcast systems, and simulation
becomes a computationally complex task. However, utilizing the
statistics of random codebooks, systems with a small number of
antennas can be simulated in a reasonable amount of time. We present
simulation results for $M = 8$ and $N = 2$ in Figure 1(a) while
scaling the bits as per (11). As Theorem 3.2 only provides the
sufficient number of bits, this is a conservative strategy and the actual
SNR gap is found to be 2.3 dB instead of 3 dB. The simulations also
suggest that keeping the number of bits fixed will result in a rate loss
which increases with SNR. Similar results are presented in Figure
1(b) for an $N = 3$ system.
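The scaling laws (10) and (11) are straightforward to evaluate. The sketch below is a hedged illustration: it assumes that (10) has the form T/3 * P_dB - T log2(b^(1/N) - 1) + T log2(Gamma(1/T)/T) - log2(C_MN) with T = N(M-N), that (11) is its simplification for b = 2^N without the gamma term, and that C_MN = (1/T!) * prod_{i=1}^{N} (M-i)!/(N-i)!. All function names are ours.

```python
import math

def log2_C(M, N):
    """log2 of C_MN, assuming C_MN = (1/T!) * prod_{i=1}^{N} (M-i)!/(N-i)!."""
    T = N * (M - N)
    ln_c = -math.lgamma(T + 1)
    for i in range(1, N + 1):
        ln_c += math.lgamma(M - i + 1) - math.lgamma(N - i + 1)
    return ln_c / math.log(2)

def gamma_term(M, N):
    """T * log2(Gamma(1/T) / T), the gamma-function term of (10)."""
    T = N * (M - N)
    return T * (math.lgamma(1.0 / T) - math.log(T)) / math.log(2)

def bits_general(M, N, P_dB, b):
    """Sufficient bits per user for a per-user rate loss of log2(b), as in (10)."""
    T = N * (M - N)
    return ((T / 3.0) * P_dB - T * math.log2(b ** (1.0 / N) - 1.0)
            + gamma_term(M, N) - log2_C(M, N))

def bits_3dB(M, N, P_dB):
    """Simplified scaling (11) for a 3 dB SNR gap."""
    T = N * (M - N)
    return (T / 3.0) * P_dB - log2_C(M, N)

# Hypothetical example: M = 8, N = 2, so each extra 3 dB of SNR costs T = 12 bits.
extra_bits_per_3dB = bits_3dB(8, 2, 33.0) - bits_3dB(8, 2, 30.0)
```

For M = 8 and N = 2 the gamma term evaluates to a fraction of a bit in magnitude, in line with the remark that it can usually be neglected.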

4. ZERO FORCING VS. BLOCK DIAGONALIZATION 

Zero forcing is an even simpler strategy than BD, and it is important
to compare the performance of these two schemes in the presence
of limited feedback. Zero forcing for a MIMO broadcast system with
$K$ users and $N$ antennas per user is equivalent to a $KN = M$ user
system with a single antenna per user. The feedback scaling law for
such a system is derived in [5] to be:

$$B_{ZF} = \frac{M-1}{3} P_{dB} \tag{12}$$

to maintain an SNR gap of no more than 3 dB with respect to ZF
under perfect CSIT conditions. In general, BD achieves a higher
sum rate than ZF with perfect CSIT, where the rate gap is
$K \log_2(e) \sum_{j=1}^{N-1} \frac{N-j}{j}$ [8] at high SNR. In order to compare the number of bits
required for BD and ZF under imperfect CSIT and finite rate
feedback, it is necessary to fix a common target rate. The bits required
per user for ZF must also be multiplied by $N$ for fair comparison.
By setting $b = 2^{R_g + R}$ in (10), where $R_g$ is the per user rate gap
between BD and ZF with perfect CSIT and $R$ the target per user rate
loss for the ZF system, we can compare the sufficient number of bits
required to achieve the same sum rate for both strategies. For
example, $R = 1$ for a 3 dB target and this suggests a bit savings of 20%
for an $M = 6$, $N = 2$ system, and 25% for an $M = 9$, $N = 3$
system with BD. The scaling law in Theorem 3.2 is however highly
conservative for large $b$, and though it is possible to see that BD has
a clear advantage in terms of the sufficient number of bits required,
it is somewhat underestimated. If a ZF system is scaled to maintain
a 3 dB SNR gap relative to perfect CSIT and the number of feedback
bits for BD is numerically determined to achieve the same sum rate,
bit savings are about 40-50% for an $M = 6$, $N = 2$ system.
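One way to read the quoted 20% and 25% savings is as the ratio of the SNR slopes of the two scaling laws, i.e. N(M-N)/3 bits per dB for BD against N(M-1)/3 bits per dB per user for ZF. The sketch below checks this reading; it is our interpretation rather than a computation taken from the paper. The per-user rate gap function assumes the high-SNR expression log2(e) times the sum over j = 1, ..., N-1 of (N-j)/j, which is how we read the formula cited above.

```python
import math

def bd_zf_rate_gap_per_user(N):
    """Assumed high-SNR per-user rate gap (bps/Hz) between BD and ZF with perfect CSIT."""
    return math.log2(math.e) * sum((N - j) / j for j in range(1, N))

def slope_saving(M, N):
    """Fractional reduction in feedback slope (bits per dB per user) of BD relative to ZF."""
    bd_slope = N * (M - N) / 3.0  # slope of the BD law
    zf_slope = N * (M - 1) / 3.0  # slope of the ZF law, multiplied by N antennas per user
    return 1.0 - bd_slope / zf_slope

for M, N in ((6, 2), (9, 3)):
    print(f"M={M}, N={N}: slope saving {100 * slope_saving(M, N):.0f}%, "
          f"rate gap {bd_zf_rate_gap_per_user(N):.2f} bps/Hz per user")
```

The slope ratio captures only the asymptotic behaviour; the constant offset obtained from the target rate loss shifts the absolute bit counts at any finite SNR.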

5. QUANTIZING THE CHANNEL 

The scaling law in Section 3.2 was derived considering random
codebooks, which are impractical for real world applications. Although
vector quantization codebooks can be designed for more practical
systems, doing so is likely to require very high complexity due to the
large number of bits at each mobile. It is thus worthwhile to investigate
low complexity scalar quantization schemes. We believe that simple
scalar quantization methods are capable of achieving the same bit
scaling rate as random codes, though they will incur a constant rate
loss.

The scalar quantization scheme is first presented for MISO
systems (based on the idea in [9]). A complex channel vector
$\mathbf{h}_i = [h_1, \ldots, h_M]^T \in \mathbb{C}^{M \times 1}$ is first divided by one of its elements, say
$h_1$, to yield $M - 1$ complex elements. The phase of each of these
elements is quantized separately and uniformly in the interval $[-\pi, \pi]$.
The inverse tangents of the magnitudes, for example $\tan^{-1}(|h_2 / h_1|)$,
are quantized uniformly in the interval $[0, \pi/2]$. Nonuniform
quantization based on the distribution of these random variables is also
possible, but (sub-optimal) uniform quantization appears to be
sufficient for the number of feedback bits to scale linearly with SNR
with the same slope as with random codebooks. The total number of
bits available to a user is assumed to be distributed equally among
the phases and magnitudes of the $M - 1$ elements as far as possible,
and the remaining bits are randomly assigned.
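The MISO scheme just described can be sketched with the standard library alone. The sketch makes several simplifying assumptions that go beyond the text: every phase gets the same number of bits and every magnitude gets the same number of bits (no random assignment of leftover bits), the first element is used as the divisor, and each value is reconstructed at the midpoint of its quantization cell. The function names are hypothetical.

```python
import cmath
import math

def uniform_quantize(x, lo, hi, bits):
    """Midpoint of the uniform cell of [lo, hi] that contains x, using 2**bits cells."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    k = min(int((x - lo) / step), levels - 1)  # clamp x == hi into the last cell
    return lo + (k + 0.5) * step

def quantize_direction(h, phase_bits, mag_bits):
    """Scalar-quantize the direction of a channel vector h (sequence of complex, h[0] != 0).

    Returns the reconstructed vector, normalized so that its first element is 1.
    Total feedback is (len(h) - 1) * (phase_bits + mag_bits) bits.
    """
    q = [1.0 + 0.0j]
    for hm in h[1:]:
        r = hm / h[0]  # divide by the reference element
        phase = uniform_quantize(cmath.phase(r), -math.pi, math.pi, phase_bits)
        angle = uniform_quantize(math.atan(abs(r)), 0.0, math.pi / 2, mag_bits)
        q.append(math.tan(angle) * cmath.exp(1j * phase))
    return q

# Hypothetical example: M = 3 antennas, 8 bits per phase and 8 bits per magnitude.
h = [1 + 1j, 2 - 1j, 0.5j]
q = quantize_direction(h, phase_bits=8, mag_bits=8)
```

Quantizing the inverse tangent rather than the magnitude itself maps the unbounded ratio onto a bounded interval, which is what makes a uniform quantizer applicable.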

For MIMO systems, this is generalized to quantizing the magnitude
and phase (in the same manner) of the lower $(M - N) \times N$
entries of the matrix $\mathbf{H}_i([\mathbf{I} \;\, \mathbf{Z}]\mathbf{H}_i)^{-1}$, where $\mathbf{I}$ is the $N \times N$ identity
matrix and $\mathbf{Z}$ the $N \times (M - N)$ zero matrix.

Although we do not offer an analytical proof that this scheme
achieves the same bit scaling rate as random codebook quantization,
we present simulation results that strongly suggest this. The bits
for scalar quantization are scaled according to Equation (12) for an
$M = 6$, $N = 1$ MISO system (Figure 2). This maintains a constant
gap with the perfect CSIT curve, although there is a 2.7 dB SNR loss
with respect to random codebook quantization. More sophisticated
scalar quantization methods may be able to reduce this gap as well,
which indicates that simple scalar quantization schemes could
perform quite well. Similar results for MIMO systems are presented in
Figures 1(a) and 1(b) with 4 and 2 users respectively.
(b) MIMO Broadcast Channel with M = 6, N = 3, K = 2

Fig. 1. Sum Rate with Block Diagonalization and finite rate feedback

Fig. 2. Scalar quantization in MISO systems (M = 6, N = 1)



